DOC: update the pandas.DataFrame.notna and pandas.Series.notna docstring #20160

Merged
merged 12 commits into from
Mar 13, 2018

datadonK23
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

Two Validations for pandas.DataFrame.notna and pandas.Series.notna (shared docs).

################################################################################
###################### Docstring (pandas.DataFrame.notna) ######################
################################################################################

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).
NA values, such as None or :attr:`numpy.NaN`, get mapped to False
values.

Returns
-------
bool of type DataFrame
    Mask of True/False values for each element in DataFrame that
    indicates whether an element is not an NA value

See Also
--------
DataFrame.notnull : alias of notna
DataFrame.isna : boolean inverse of notna
DataFrame.dropna : omit axes labels with missing values
notna : top-level notna

Examples
--------
Show which entries in a DataFrame are not NA.

>>> df = pd.DataFrame({'age': [5, 6, np.NaN],
...                    'born': [pd.NaT, pd.Timestamp('1939-05-27'),
...                             pd.Timestamp('1940-04-25')],
...                    'name': ['Alfred', 'Batman', ''],
...                    'toy': [None, 'Batmobile', 'Joker']})
>>> df
   age       born    name        toy
0  5.0        NaT  Alfred       None
1  6.0 1939-05-27  Batman  Batmobile
2  NaN 1940-04-25              Joker

>>> df.notna()
     age   born  name    toy
0   True  False  True  False
1   True   True  True   True
2  False   True  True   True

Show which entries in a Series are not NA.

>>> ser = pd.Series([5, 6, np.NaN])
>>> ser
0    5.0
1    6.0
2    NaN
dtype: float64

>>> ser.notna()
0     True
1     True
2    False
dtype: bool

################################################################################
################################## Validation ##################################
################################################################################

Docstring for "pandas.DataFrame.notna" correct. :)
################################################################################
####################### Docstring (pandas.Series.notna)  #######################
################################################################################

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).
NA values, such as None or :attr:`numpy.NaN`, get mapped to False
values.

Returns
-------
bool of type Series
    Mask of True/False values for each element in Series that
    indicates whether an element is not an NA value

See Also
--------
Series.notnull : alias of notna
Series.isna : boolean inverse of notna
Series.dropna : omit axes labels with missing values
notna : top-level notna

Examples
--------
Show which entries in a DataFrame are not NA.

>>> df = pd.DataFrame({'age': [5, 6, np.NaN],
...                    'born': [pd.NaT, pd.Timestamp('1939-05-27'),
...                             pd.Timestamp('1940-04-25')],
...                    'name': ['Alfred', 'Batman', ''],
...                    'toy': [None, 'Batmobile', 'Joker']})
>>> df
   age       born    name        toy
0  5.0        NaT  Alfred       None
1  6.0 1939-05-27  Batman  Batmobile
2  NaN 1940-04-25              Joker

>>> df.notna()
     age   born  name    toy
0   True  False  True  False
1   True   True  True   True
2  False   True  True   True

Show which entries in a Series are not NA.

>>> ser = pd.Series([5, 6, np.NaN])
>>> ser
0    5.0
1    6.0
2    NaN
dtype: float64

>>> ser.notna()
0     True
1     True
2    False
dtype: bool

################################################################################
################################## Validation ##################################
################################################################################

Docstring for "pandas.Series.notna" correct. :)
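
The behaviour described in the docstrings above can be checked with a short script. This is a minimal sketch, not part of the PR: the `score` column and the variable name `mask` are illustrative additions, `np.nan` is used in place of the `np.NaN` alias from the docstring, and the `use_inf_as_na` option is assumed to be left at its default.

```python
import numpy as np
import pandas as pd

# Small frame mixing NA values (NaN, None) with values that only look "empty".
df = pd.DataFrame({
    "age": [5, 6, np.nan],
    "name": ["Alfred", "Batman", ""],
    "toy": [None, "Batmobile", "Joker"],
    "score": [np.inf, 1.0, 2.0],  # illustrative column, not in the docstring
})

mask = df.notna()

# The result is a DataFrame of the same shape with boolean columns.
assert isinstance(mask, pd.DataFrame)
assert mask.shape == df.shape
assert all(dtype == bool for dtype in mask.dtypes)

# NaN and None are NA; the empty string and inf are not.
assert not mask.loc[2, "age"]
assert not mask.loc[0, "toy"]
assert mask.loc[2, "name"]
assert mask.loc[0, "score"]

# notna is the boolean inverse of isna; notnull and pd.notna give the same mask.
assert mask.equals(~df.isna())
assert mask.equals(df.notnull())
assert mask.equals(pd.notna(df))
```

The same checks apply to a Series, where `ser.notna()` returns a boolean Series.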

values.
Everything else get mapped to False values. Characters such as empty
strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).
Contributor

Does this attr work? I don't know if it's in our API docs. I think it'd be OK to just have: unless you set ``pandas.options.mode.use_inf_as_na = True``

Author

We have seen :attr: references throughout other docstrings (frame.py). I can change it, thoughts?

Contributor

It's fine.

Member

It's not the attr itself, but the fact there is nothing to link to I think?

I would rather make this ``pandas.options.mode.use_inf_as_na = True``

Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings `''` or :attr:`numpy.inf` are not considered NA values
(unless you set :attr:`pandas.options.mode.use_inf_as_na` `= True`).
Contributor

Same here about the pandas option.


Returns
-------
bool of type %(klass)s
Contributor

Is this type annotation right? It looks like bool is the return variable name (which is not needed).
https://python-sprints.github.io/pandas/guide/pandas_docstring.html#section-3-parameters

Author

Not the name

Contributor

A complex type then? So it should be something like "dict of int", but it looks like it's inverted.

Author

We get a DataFrame or a Series that contains boolean values.
It feels like writing "DataFrame of bool" or "Series of bool" may be harder to understand for new users. Any thoughts?

Contributor

The guidelines specify that complex types should be written like list of bool. Maybe you can just say that the return type is DataFrame and in the explanation specify that its dtype is bool.

Contributor

I think the type should be just %(klass)s

%(klass)s
    Each element of the %(klass)s will be a boolean.

Contributor

@villasv villasv Mar 12, 2018

Agreed, it's the closest to the guidelines.
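
For readers unfamiliar with the `%(klass)s` placeholder in this thread: it is ordinary Python %-style formatting with a mapping, which lets one docstring template serve both DataFrame and Series. The sketch below only illustrates that substitution; `RETURNS_TEMPLATE` and `render_returns` are hypothetical names, and this is not the mechanism pandas itself uses to share docstrings.

```python
# Hypothetical template, using the Returns wording suggested in the review.
RETURNS_TEMPLATE = """\
Returns
-------
%(klass)s
    Each element of the %(klass)s will be a boolean.
"""


def render_returns(klass: str) -> str:
    """Fill the %(klass)s placeholder via %-style string formatting."""
    return RETURNS_TEMPLATE % {"klass": klass}


frame_doc = render_returns("DataFrame")
series_doc = render_returns("Series")

print(frame_doc)
print(series_doc)
```

With this form, the type line reads simply `DataFrame` or `Series`, and the boolean element type is stated in the description, which matches the guideline cited above.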


pep8speaks commented Mar 12, 2018

Hello @Donk23! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 13, 2018 at 13:49 Hours UTC

Member

@jorisvandenbossche jorisvandenbossche left a comment

Nice docstrings! Added a very minor comment

Return a boolean same-sized object indicating if the values are NA.
NA values, such as None or :attr:`numpy.NaN`, get mapped to True
values.
Everything else get mapped to False values. Characters such as empty
Member

get -> gets ?

Author

Sorry, not my native language. Fixed the typos and changed the link as proposed in the .notna docstring.


codecov bot commented Mar 13, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@e7e0ea8).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master   #20160   +/-   ##
=========================================
  Coverage          ?   91.76%           
=========================================
  Files             ?      150           
  Lines             ?    49146           
  Branches          ?        0           
=========================================
  Hits              ?    45097           
  Misses            ?     4049           
  Partials          ?        0

Flag        Coverage Δ
#multiple   90.14% <100%> (?)
#single     41.9% <0%> (?)

Impacted Files           Coverage Δ
pandas/core/generic.py   95.85% <100%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e7e0ea8...744dc61.

@jorisvandenbossche jorisvandenbossche merged commit b7b00c5 into pandas-dev:master Mar 13, 2018
@jorisvandenbossche
Member

@Donk23 Thanks a lot!

If possible, feel free to check the online docs in a few hours to see whether everything looks OK: http://pandas-docs.github.io/pandas-docs-travis/generated/pandas.DataFrame.notna.html#pandas.DataFrame.notna

@datadonK23 datadonK23 deleted the docstring_notna branch March 14, 2018 11:57
6 participants